Regular-expression derivatives re-examined
Abstract
Regular-expression derivatives are an old, but elegant, technique for compiling regular expressions to deterministic finite-state machines. It easily supports extending the regular-expression operators with boolean operations, such as intersection and complement. Unfortunately, this technique has been lost in the sands of time and few computer scientists are aware of it. In this paper, we reexamine regular-expression derivatives and report on our experiences in the context of two different functional-language implementations. The basic implementation is simple and we show how to extend it to handle large character sets (e.g., Unicode). We also show that the derivatives approach leads to smaller state machines than the traditional algorithm given by McNaughton and Yamada.
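To make the idea concrete, here is a minimal sketch of a derivative-based matcher that includes intersection and complement. It is an illustration in Python, not the paper's implementation (which uses functional languages); the tuple encoding and the names `nullable`, `deriv`, and `matches` are assumptions made for this example, and no simplification of derivatives is attempted.

```python
# Regular expressions as tagged tuples (a hypothetical encoding for this sketch).
EMPTY = ("empty",)   # matches no string
EPS = ("eps",)       # matches only the empty string

def nullable(r):
    """Return True if r accepts the empty string."""
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag in ("empty", "chr"):
        return False
    if tag in ("cat", "and"):
        return nullable(r[1]) and nullable(r[2])
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    if tag == "not":
        return not nullable(r[1])
    raise ValueError(f"unknown node: {tag}")

def deriv(r, a):
    """Derivative of r with respect to the single character a."""
    tag = r[0]
    if tag in ("empty", "eps"):
        return EMPTY
    if tag == "chr":
        return EPS if r[1] == a else EMPTY
    if tag == "cat":
        left = ("cat", deriv(r[1], a), r[2])
        # If the first part can be empty, a may also be consumed by the second.
        return ("alt", left, deriv(r[2], a)) if nullable(r[1]) else left
    if tag in ("alt", "and"):
        # Union and intersection distribute over the derivative.
        return (tag, deriv(r[1], a), deriv(r[2], a))
    if tag == "not":
        # Complement commutes with the derivative.
        return ("not", deriv(r[1], a))
    if tag == "star":
        return ("cat", deriv(r[1], a), r)
    raise ValueError(f"unknown node: {tag}")

def matches(r, s):
    """Take the derivative for each character, then test nullability."""
    for ch in s:
        r = deriv(r, ch)
    return nullable(r)

# Example: strings over {a, b} that do NOT contain "bb".
A, B = ("chr", "a"), ("chr", "b")
ANY = ("star", ("alt", A, B))
HAS_BB = ("cat", ANY, ("cat", B, ("cat", B, ANY)))
NO_BB = ("and", ANY, ("not", HAS_BB))
```

With these definitions, `matches(NO_BB, "abab")` holds while `matches(NO_BB, "abba")` does not. The boolean operators need only one extra line each in `nullable` and `deriv`, which is the sense in which the technique extends easily.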
Similar resources
Combining Regular Expressions with Near-Optimal Automata in the FIRE Station Environment
Derivatives of regular expressions were first introduced by Brzozowski (1964). By recursively computing all derivatives of a regular expression, a deterministic automaton can be constructed. To guarantee convergence of this process, derivatives are compared modulo similarity, i.e. modulo associativity, commutativity, and idempotence of the union operator. Additionally, through sim...
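The construction described in this abstract can be sketched as a worklist loop over derivatives, with a union constructor that normalizes modulo associativity, commutativity, and idempotence. This is an illustrative Python sketch, not code from the cited work: the tuple encoding and all function names are assumptions, and the extra rewrites in `alt` and `cat` (dropping the empty language, removing epsilon factors) go beyond plain similarity and are included only to keep the states small.

```python
EMPTY = ("empty",)   # matches no string
EPS = ("eps",)       # matches only the empty string

def alt(*rs):
    """Union in canonical form: flattened, deduplicated, sorted."""
    items = set()
    for r in rs:
        if r[0] == "alt":
            items.update(r[1])        # associativity: flatten nested unions
        elif r != EMPTY:
            items.add(r)              # idempotence via the set; EMPTY is dropped
    if not items:
        return EMPTY
    if len(items) == 1:
        return next(iter(items))
    return ("alt", tuple(sorted(items, key=repr)))  # commutativity: fixed order

def cat(l, r):
    if l == EMPTY or r == EMPTY:
        return EMPTY
    if l == EPS:
        return r
    if r == EPS:
        return l
    return ("cat", l, r)

def nullable(r):
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    if tag == "alt":
        return any(nullable(x) for x in r[1])
    return False                      # "empty" and "chr"

def deriv(r, a):
    tag = r[0]
    if tag == "chr":
        return EPS if r[1] == a else EMPTY
    if tag == "cat":
        left = cat(deriv(r[1], a), r[2])
        return alt(left, deriv(r[2], a)) if nullable(r[1]) else left
    if tag == "alt":
        return alt(*(deriv(x, a) for x in r[1]))
    if tag == "star":
        return cat(deriv(r[1], a), r)
    return EMPTY                      # "empty" and "eps"

def build_dfa(r, alphabet):
    """Explore all derivatives of r; each distinct normal form is a state."""
    states, trans, work = {r: 0}, {}, [r]
    while work:
        q = work.pop()
        for a in alphabet:
            d = deriv(q, a)
            if d not in states:
                states[d] = len(states)
                work.append(d)
            trans[(states[q], a)] = states[d]
    accepting = {i for q, i in states.items() if nullable(q)}
    return states, trans, accepting

def accepts(trans, accepting, s):
    state = 0                         # the start state is the original expression
    for ch in s:
        state = trans[(state, ch)]
    return state in accepting

# Example: (a|b)*abb over the alphabet {a, b}.
A, B = ("chr", "a"), ("chr", "b")
R = cat(("star", alt(A, B)), cat(A, cat(B, B)))
STATES, TRANS, ACCEPTING = build_dfa(R, "ab")
```

For this example the loop terminates with four states. Without the canonical form in `alt`, syntactically different but equivalent unions would be treated as new states and the exploration need not terminate.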
Derivatives of Regular Expressions
The paper proposes a characterization of the structure of derivatives and proves several properties of derivatives. This work can be used to solve an issue in using Berry and Sethi’s result, i.e., finding the unique representatives of derivatives. Keywords: regular expressions, derivatives, finite automata.
Partial Derivatives of Regular Expressions and Finite Automaton Constructions
We introduce a notion of partial derivative of a regular expression and apply it to finite automaton constructions. The notion is a generalization of the known notion of word derivative due to Brzozowski: partial derivatives are related to non-deterministic finite automata (NFAs) in the same natural way as derivatives are related to deterministic ones (DFAs). We give a constructive definition...
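The contrast with ordinary derivatives can be sketched as follows: a partial derivative returns a set of expressions rather than a single one, and each element of the set plays the role of an NFA state. This Python sketch follows the standard textbook rules for partial derivatives; the tuple encoding and the names `pderiv`, `nullable`, and `matches` are assumptions for illustration and are not taken from the cited paper.

```python
EMPTY = ("empty",)   # matches no string
EPS = ("eps",)       # matches only the empty string

def cat(l, r):
    """Concatenation that drops a leading epsilon, so states stay small."""
    return r if l == EPS else ("cat", l, r)

def nullable(r):
    tag = r[0]
    if tag in ("eps", "star"):
        return True
    if tag == "cat":
        return nullable(r[1]) and nullable(r[2])
    if tag == "alt":
        return nullable(r[1]) or nullable(r[2])
    return False                      # "empty" and "chr"

def pderiv(r, a):
    """Set of partial derivatives of r with respect to the character a."""
    tag = r[0]
    if tag == "chr":
        return {EPS} if r[1] == a else set()
    if tag == "alt":
        # Union becomes set union: the nondeterministic choice.
        return pderiv(r[1], a) | pderiv(r[2], a)
    if tag == "cat":
        out = {cat(p, r[2]) for p in pderiv(r[1], a)}
        if nullable(r[1]):
            out |= pderiv(r[2], a)
        return out
    if tag == "star":
        return {cat(p, r) for p in pderiv(r[1], a)}
    return set()                      # "empty" and "eps"

def matches(r, s):
    """Simulate the NFA whose states are partial derivatives of r."""
    current = {r}
    for ch in s:
        current = {p for q in current for p in pderiv(q, ch)}
    return any(nullable(q) for q in current)

# Example: (a|b)*abb
A, B = ("chr", "a"), ("chr", "b")
R = ("cat", ("star", ("alt", A, B)), ("cat", A, ("cat", B, B)))
```

Because unions are represented by Python sets, no separate normalization of the union operator is needed here; the set takes care of duplicates.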
Derivatives for Enhanced Regular Expressions
Regular languages are closed under a wealth of formal language operators. Incorporating such operators in regular expressions leads to concise language specifications, but the transformation of such enhanced regular expressions to finite automata becomes more involved. We present an approach that enables the direct construction of finite automata from regular expressions enhanced with further o...
From Regular Expressions to Deterministic Automata
The main theorem allows an elegant algorithm to be refined into an efficient one. The elegant algorithm for constructing a finite automaton from a regular expression is based on 'derivatives of' regular expressions; the efficient algorithm is based on 'marking of' regular expressions. Derivatives of regular expressions correspond to state transitions in finite automata. When a finite automaton ...
Journal:
- J. Funct. Program.
Volume 19
Publication date 2009